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ABSTRACT 



Transient radio signals of astrophysical origin present an avenue for studying 
the dynamic universe. With the next generation of radio interferometers be- 
ing planned and built, there is great potential for detecting and studying large 
samples of radio transients. Currently-used image-based techniques for detecting 
radio sources have not been demonstrated to be optimal, and there is a need for 
development of more sophisticated algorithms, and methodology for comparing 
different detection techniques. A visibility-space detector benefits from our good 
understanding of visibility-space noise properties, and does not suffer from the 
image artifacts and need for deconvolution in image-space detectors. In this pa- 
per, we propose a method for designing optimal source detectors using visibility 
data, building on statistical decision theory. The approach is substantially differ- 
ent to conventional radio astronomy source detection. Optimal detection requires 
an accurate model for the data, and we present a realistic model for the likelihood 
function of radio interferometric data, including the effects of calibration, signal 
confusion and atmospheric phase fluctuations. As part of this process, we derive 
fundamental limits on the calibration of an interferometric array, including the 
case where many relatively weak "in-beam" calibrators are used. These limits 
are then applied, along with a model for atmospheric phase fluctuations, to de- 
termine the limits on measuring source position, flux density and spectral index, 
in the general case. We then present an optimal visibility-space detector using 
realistic models for an interferometer. 

Subject headings: methods: statistical — radio continuum: general — techniques: 
interferometric 
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1. Introduction 

The next generation of wide-field survey radio interferometers (MWA, ASKAP, ATA, 
LOFAR, LWA), culminating in the Square Kilometre Array (SKA), faces new challenges 
in meeting the ambitious science goals of the twenty-first century. These goals demand 
advances in telescope engineering and data-processing design, as well as sophisticated and 
novel observational techniques. Wide-field instruments will generate data at high rates, 
and therefore require techniques that optimally use the data. One of the main science goals 
of these instruments is to detect transient and variable radio sources, and as such a key 
requirement will be optimal source detection. 

The time domain, historically dominated by pulsar observations, is broadening to study 
new classes of dynamic sources. New high-sensitivity, large collecting area instruments 
afford the opportunity to detect and study transient signals. One group of well-known 
radio transients are pulsars. The periodicity of these sources is used to detect them. The 
more general class of transients sources, with non-periodic behaviour (episodic), has not 
been surveyed and studied systematically. As well as the knowledge we can gain from 
studying expected transient sources (e.g., GRBs, pulsating stars), there is great potential 
for observing exotic astrophysical events such as annihilating black holes, gravity wave 



event s (e.g., colliding black holes), magnetars and extraterrestrial signals ( iMacquart et al. 



20101). Optimal source detection has a role for detection of both fast and slow transients. 



Detection of signals in radio astronomy typically occurs in the image domai n, and 



uses either a simple highest-peak thresholding, or a matched filtering operation (jCordes 



20091 ). The former sets a threshold value above the noise level, and attempts to detect the 
strongest signal in an image, and then model the incompleteness to remove its sidelobes. 
The next strongest signal is then compared to the threshold and modelled, and the process 
continues until there are no more signals above the threshold. This process works well for 
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strong signals that are well-separated (no source confusion), and well-understood noise. 



Matched filtering is an operation that is derived from statistical decision 



theor y. The 



19981). It 



matched filter correlates the received data with a replica of the signal ( iKayl 
weights the data according to the signal strength, thereby allowing the datapoints with 
the strongest signals to contribute more to the filter output. The matched filter is optimal 
for white gaussian noise (uncorrelated n oise) and k nown signal. For non-Gaussian noise. 



the matched filter is no longer optimal ( iKay 



19981 ). Hence, the matched filter is not 



necessarily the best detector that can be designed for source detection in image-space 
data. In addition, to perform a matched filtering operation with sources with unknown 
parameters (e.g., sky position, strength), radio astronomers typically correlate (match) 
the data with a s et of pre-dete rmined templates, to find the template that produces the 



maximal output (ICordes 



20091). Any inaccuracies in the templates compared with the 
true signal will degrade performance. Matched filters are applied in static fields, and their 
application becomes problematic for dynamic datasets: the loss in detection performance 
due to a spatial mismatch of the filter can be compounded by a temporal mismatch. 

Recent radio surveys (e.g., FIRST, NVSS, SUMSS) have employed matched filter 
plus flux limit threshold detectors, typically fitting gaussians with free parameters as the 
signal filter, and choosing a flux limit based on the estima ted noise level of the dataset 



(IBecker et al 



1995 



Condon et al. 



1998 



Mauch et al. 



20031). 



Detection of sources in image space can be problematic, firstly because it requires 
deconvolution of the i mage prior to detect io n: over-, unde r- a nd inaccura t e clea ning can 



lead to biased images (I Condon et al. 



19981 ). IPerlevI (11999[ l and 



Rau et al. 



(120091 ) discuss 



the origin and magnitude of errors in synthesis images. In general, the many-to-one fourier 
operation performed on visibility data to obtain image data propagates any non-gaussianity 
in the {uv) data to the image plane, producing correlations in the noise between pixels 
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across the field (iRefregier and Browrull998l ). Imperfect calibration and atmospheric effects 
yield deviations from gaussianity in visibility data. In addition, images contain structured 
backgrounds: source sidelobes, gridding artifacts (although, these are minimal for snapshot 
observations), and, in some cases, confusing sources. The presence of source sidelobes due 
to data incompleteness confounds identification of true sources. Differencing of temporally 
adjacent images to detect transient sources is complicated by non- uniform image pixel size 
and changes in the sidelobe distribution due to the different uf-plane sampling. It can 
also lead to artifacts when subtracting one image with a complicated noise structure, from 
another with a different noise structure. Finally, removal (flagging) of baselines changes the 
beam shape, yielding subtraction artifacts in the image. Although data incompleteness and 
non-gaussian noise also exist in visibility data, in that space we can explicitly consider only 
the measured data in our detector, and model the deviations from gaussianity: detectors in 
image space work on the intensity of image pixels, and cannot account for signal that is 
inferred by the image production process. 

We do not mean to sugges t that practical and useful source d etection cannot be 



performed in the image plane. 



Wijnholds and van der Veen 



( I2OO8I ) describe how to 



propagate errors in visibility s pace to image space. This type of technique (and others 



such as bootstrapping, used by 



Kemball et al. 



2010 



as a tool to assess radio image fidelity) 



can be used to extend an understanding of the statistical properties of the data into image 
space. We do, however, regard visibility space as a more natural space for optimal detection, 
because the data are in a form closer to the original signals collected by the antennas, 
the covariance structure can be more simply expressed, and no data are inferred through 
interpolation in the image plane and extrapolation from the uv plane (as is the case with 
forming an image from incomplete uv data). 

An emerging application of general source detection is detection of transient sources. 
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Transient source detection and characterization are key science drivers for many synthesis 
imaging arrays that are under construction. The Murchison Widefield Array (MWA) 
and Austrahan SKA Pathfinder (ASKAP) are two SKA precursor instruments under 
construction in the Western Austrahan desert. ASKAP is an array of 36 12- metre dish 
antennas, currently being constructed at the Murchison Radio-astronomy Observatory 
(MRO), Western Austraha, and operated by CSIRO. The MWA wiU probe much of the 
parameter space of interest to the SKA. It is also under construction at MRO, will comprise 
512 tile-type antennas with multiple dipole sensors per tile, and is being constructed by 
a consortium of local and foreign institutions and government agencies. The MWA and 
ASKAP use fundamentally different hardware for signal reception and processing, and 
operate in different frequency ranges (ASKAP: 0.7-1.8 GHz, MWA: 80-300 MHz), thereby 
producing complementary data sets, and pursuing different science goals. The MRO is 
Australia and New Zealand's candidate site for the core of the SKA. 

The MWA and ASKAP are both wide-field instruments, and will both contend with 
variations in calibration across the field-of-view. As such, they will make use of 'field-based' 
calibration, whereby calibration sources across the field are used to form a model for 



the antenna beam (IKassim et al. 



20071 ). The operating frequencies of both instruments 
also make them subject to atmospheric and ionospheric effects on the signal wavefront. 
These effects include blurring of the source position, due to phase fluctuations from the 
troposphere (ASKAP), and source s hifting due to differen tial excess path length to antennas 



produced by the ionosphere (MWA, 
faced by wide-fleld instruments. 



Mitchell et al. 



20081 ). These are some of the challenges 



Design of transient detectors is relatively new. In general, expected transient source 
populations will be dominated by weak sources and therefore extraction of sources close to 
the noise limit will generate the most new and interesting science. Hence, transient detector 



- 7- 



design is an important field of researcli. Tlie Allen Telescope Array (ATA) has recently 



repor ted the initial development of their slow transient detection pipeline (I Croft et al. 



20101 ). At this preliminary stage, they are matching known catalogues with sources in their 
fields, and have not found any new convincing transient candidates. Within the ASKAP 
project, the same fields of sky will be observed periodically to detect slow transients with 
the VAST survey. An image of the sky will be produced periodically, and these images 
searched for all sources. Any detections will be added to a searchable database, from 
which light-curves of objects can be extracted. Transient signals will necessarily show 
brightness variability over time. The current method being proposed to detect sources 
is Duchampi^. Duchamp uses either a simple thresholding to find sources, or a more 
sophisticated algorithm based on statistical decision theory. The downside to the method 
is the use of global parameters, and assumed white gaussian noise properties to define the 
PDFs. Noise properties can be assumed to be uniform for small fields, but may vary greatly 
over the field for the large fi elds-of-view sara pled in ASKAP and the MWA. In addition, the 
technique is not 'real-time'. iFridmanI (120101 ) has recently proposed a method for detecting 



single fast transient events from single-dish datasets, using a cumulative signal method 
based on statistical decision theory. 

Optimal detection of a signal relies on our knowledge of the properties of the signal — 
location, shape and amplitude. For signals with unknown parameters (e.g., transient radio 
sources), the detection performance is governed by our ability to accurately and precisely 
estimate the parameter values, and accurately model the data likelihood function. It is 
therefore crucial to understand the fundamental estimati on limits of a particular instrument . 



before proceeding to determine the detection limits (see 



Wijnholds and van der Veen 



2008 



for a recent review of fundamental radio imaging limits). In this paper, we describe how to 



^Software and user guide available at http://www.atnf.csiro.au/people/MaUhew. Whiting /Duchamp 
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design an optimal source detector with visibility data. We then describe the form of this 
detector for realistic interferometers, including the effects of imperfect calibration, signal 
confusion and atmospheric phase fluctuations. As part of this process, and to investigate the 
detection limits of instruments, we derive fundamental estimation limits for measurement 
of source parameters with interferometers. In Paper II we explore particular algorithms 
for source detection with real interferometers and simulated and real datasets, using the 
estimation limit results and theory from Paper I. We particularly focus on the problem of 
optimally detecting slow radio transients, although the methods presented here are generally 
applicable to source detection. As such, the estimation precision results we derive are 
based on short integrations (8-lOs), appropriate for the instruments under consideration, 
for which a transient detection test is performed at each output of the correlator. 

In section [2] we introduce statistical decision theory, including methodology for 
designing an optimal detector. We then introduce the Cramer-Rao lower bound (CRB) on 
estimation precision, as a metric for evaluating the source parameter measurement precision 
of interferometers. We then discuss detection of a single point source (section [3?T|) . a single 
point source embedded in a field (section [3^ . and a single transient point source embedded 
in a field (section l3.3p . with a visibility dataset and thermal noise. In section W7L\ we discuss 
the effect of calibration errors, source confusion, and atmospheric phase noise on signal 
estimation and detection, and in section |5] present a realistic detector for visibility data. 

2. Statistical decision theory 

2.1. Neyman-Pearson test and simple hypothesis testing 

Statistical decision theory is the branch of mathematical statistics that describes the 
detection of signals in noise. Signal detection is underpinned by hypothesis testing: in 
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the binary case, this means deciding between two hypotheses (signal present and signal 
absent). For a given set of observational data, the likehhood that data was obtained from 
each hypothesis is calculated, and the ratio of these probabilities is used to compare to 
a threshold. If the ratio is greater than the threshold, we decide that a signal has been 
detected. The value of the threshold is set according to the tolerance on the false alarm rate 
(rate of false positives or misses). This is a likelihood ratio test (LRT), and is applicable for 
a deterministic signal in known noise. The likelihood is the probability of the data given a 
set of parameters. 

If the null hypothesis (signal absent) is denoted Hq, and the alternative hypothesis 
(signal present) is denoted Hi, then the hypotheses can be written as: 



Hi:x[n\ = {n^l,...,N) 

Hq : x[n\ = w[n], 



(1) 



where s[n] is the known deterministic signal we wish to detect, and w[n\ is the known noise. 
Under these two hypotheses, the likelihood ratio test for the dataset x[n\ decides a signal is 
present if, 

(2) 



nx)^^>A, 



L(x;i7o) 

where T'(x) is the test statistic, A is the threshold, and L(x) is the likelihood function 
(probability distribution function parametrized by the model parameters). For example, if 
the signal is a DC level. A, in white Gaussian noise (WGN), iV(0, a^), the LRT is: 







N 




exp 








n=l 








N 




exp 










n=l 





> A. 



(3) 



Taking the logarithm of both sides, and incorporating non-data terms into the threshold. 
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we decide Hi if, 



(4) 



n=l 



In this simple case, the test statistic is the sample mean, which makes sense intuitively. 

We wish to maximise the probability of detection, subject to a given probability of 
false detection. Mathematically, the probability of detection, is the chance of deciding 
Hi when the datum is actually drawn from Hi, and is given by. 



where -Ri is the region of the likelihood function above the threshold. The false positive 
rate is given by. 



For a given probability of false detection, a, the Neyman-Pearson theorem states that the 
probability of detection, P^, is maximised when we decide Hi according equation [2J Hence, 
for a given tolerance of false positive signals (chosen according to the detection task and 
the implications of detecting false positives), the threshold and detection performance are 
defined. The LRT therefore yields an optimal detector (Pp maximised for a given Pfa)- 

To design an optimal detector, the likelihood function needs to accurately describe the 
data. In the following sections, we describe how to accurately model (i) the signal, (ii) the 
noise and background properties of interferometric data. 




(5) 




(6) 



2.2. 



Generalised Likelihood Ratio Test 



The likelihood ratio test described in the previous section can be applied to known 
deterministic signals with additive known noise. In the case where some parameters are 
unknown, the values of these parameters need to estimated before proceeding (e.g., source 
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position, flux density). To achieve this, the hkehhood function is maximized with respect 
to each of the unknown parameters, and the LRT is evaluated at the maximum likelihood 
estimates (MLE). MLEs are asymptotically efficient: for large datasets, they achieve 
the optimal estimation precision and are unbiased. The hypothesis test is then termed 
the Generalised Likelihood Ratio Test (GLRT). As the number of unknown parameters 
increases, the detection performance is degraded. Similarly, as the number of samples 
increases, the variance on the parameter estimate will decrease, and the estimation will 
be more precise. In addition to signal parameters that are unknown, a signal model may 
include additional parameters, in which we are not interested, that need to be estimated. 
These are nuisance p arameters, and further degrade detection performance. The clairvoyant 



detector jKay 



19981 . Ch. 6) assumes perfect knowledge of all parameters, and is useful as 
a comparison to the real detection performance obtained with the GLRT. The clairvoyant 
detector is an optimal detector, however the GLRT yields a sub-optimal detector, because 
estimation of unknown parameters is required. This observation is important when we wish 
to compare optimal detection performance with the performance we obtain with different 
detectors we might design. 



2.3. Bayesian approach 

In the case where some of the unknown parameters are random (as opposed to 
deterministic), one can use a Bayesian approach to remove them from the detector. In the 
Bayesian framework, each datum contains a realization of the random variable, and the 
likelihood function is conditioned on the parameter. One assigns prior PDFs to the random 
parameters, and determines the unconditional likelihood functions according to: 

L{x;Hi) = [ L{x\9;Hi)p{9)d9, (7) 
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where L{x\6] Hi) is the conditional PDF, p{6) is the prior distribution, and the integration 
is performed over the random parameter. If the mean value of the random parameter 
is unknown, one can initially estimate this using MLE. Thus, instead of calculating the 
MLE of the parameter to completely specify the likelihood function, one can remove it via 
integration. 



2.4. Detection performance 

As described in section 12. H the detection threshold, A is determined according to 
the acceptable false positive rate, and balances the probability of detection (Pd) with 
the probability of a false positive Pfa- It is important to quantify the performance of 
a detector, for comparison to other detectors. The optimal detection performance, that 
of the clairvoyant detector, can be calculated and used as a comparison for the realised 
performance, as an objective means to measure the utility of a detector. 



2.5. Estimation performance: Cramer-Rao lower bound 

The complete specification of the likelihood function will require the estimation of 
some parameters. Imprecise estimation will degrade detection performance. It is useful to 
have a sense of the ability to estimate the value of a parameter for given a dataset. 

To determine the theoretical optimal estimation performance with a given dataset, we 
can calculate the Cramer-Rao lower bound (CRB) on the precision of parameter estimates. 
The CRB calculates the precision with which a minimum-variance unbiased estimator 
could estimate a parameter value, using the information content of the dataset. It is 
computed as the square-root of the corresponding diagonal element of the inverse of the 
Fisher information matrix (FIM). The (ij)th entry of the FIM for a vector 6 of unknown 
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parameters is given by: 



[m]^ 



-E 



92logL(x; 6) 



(8) 



where E denotes the expectation value . For iV" independent samples in WGN and complex 



data, this expression simplifies to ( iKay 



19931), 



2Re 



a 



1 ^ ds"[n;0]ds[n;e] 



d9i 



d9i 



(9) 



n=l "J 

The CRB is a useful metric because it places a fundamental lower limit on the measurement 
precision of any parameter. In this work it will be used to gain an understanding of the 
fundamental limits of an instrument, and how these affect its estimation and detection 
performance. It has previously been used in astronomy to determine li mits on optic al 



astrometry with the WFPC2 came ra aboard the Hubble 



with focal plane array bolometers ( ISaklatvala et al 



20081 ). 



Space Telescope flAdorl 



19961 ). and 



3. Detection of sources in visibility data 

We describe here a method for detecting a point source within visibility data. In 
general, this requires detection of a source of unknown fiux density, spectral index, location, 
arrival time and duration (in the case of a transient), contained within confounding 
(nuisance) signals within the field. As described in section 12. 2[ the method involves 
maximum likelihood estimation of the unknown parameters, followed by a GLRT to decide 
the presence of a signal. We begin with the simplest problem of detecting a single point 
source in an empty field, and then add complexity. It is assumed initially that there are 
no calibration errors, source confusion or atmospheric effects present in the dataset. These 
errors will complicate the formulation of the problem, and they will be considered in section 

m 
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3.1. Detection of a single point source in an empty field 

3.1.1. Estimation of unknown parameters 

Detection of a single point source in visibility data requires estimation of unknown 
parameters, followed by detection of a complex signal in WGN. We assume we use data 
from one integration step, F frequency channels and N baselines. The location of the 
source and its amplitude (spectral flux density) are unknown, and need to be estimated 
before hypothesis testing. The data are modelled under each hypothesis as: 

H,:i[f,n] = S[f,n]+w[f,n] (n = 1, iV), (/ = 1, F) (10) 
Ho:x[f,n] = w[f,n], 

where the tilde denotes complex quantities. We will write the signal and data as real 
and imaginary components, under which the noise can be modelled as white gaussian. 
he signal, s\f,n\, is th e complex visibility for channel / and baseline n, and is given by 



(iThompson et al. 



20041 ): 



S[f,n] = V{ufn,Vfn) =11 A{l',m')I{l',m') ( 1 exp [-2TTi{ufJ' + Vfnm')]dl'dm 



(11) 

where A{1' .,m') and /(/', ml) are the antenna response function and source intensity function 
at sky position (/',m'), and the spectral dependence is modelled as a power-law with 
index a and normalized by the base frequency, vq. Assuming a point source located at 
(/' = l,m' = m), the visibility function becomes: 

V{ufn,Vfn) = A{l,m)I{l,m) i ) exp[-2TTi{ufJ + Vfnm)]. (12) 



The antenna response and source flux density functions are nuisance parameters for 
detection, and we combine them to form one scaling factor, B{1, m) = A{1, m)I{l, m), which 
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is independent of baseline. Hence, our model is: 



s[f, n] = V{ufn, Vfn) = B{1, m) ( ) exp [-27rz(u/„/ + f/„m) 



(13) 



We form the likelihood function under WGN, assuming initially that the noise variance 
is known and identical for all baselines and channels. The likelihood function, which is the 



joint PDF for F channels and baselines, is given by ( iKay 

N F 



19981 ): 



^x;i^o = nn^ 



exp 



n=l /=1 
1 



1 



— (£[/, n] - §[/, «])*(£[/, n] - n]) 



exp 



--(i-sr(i-s) 



a' 



(14) 



where H denotes Hermitian conjugate (complex conjugate tranpose), and the product has 
been collected in the matrix inner product. Substituting the signal, equation [131 iiito the 
likelihood function yields. 



1 f z 

L(x;iJi) = ^^^^^^^ exp(-- 



(15) 



where 



N F . . 

n=l f=l ^ ^ 



1^0 



exp [—27[i{Ufnl + Vfnfn)] 



X ( x[f,n] - B{l,m) { ^^^] exp [-2'n:i{ufJ + t;/„m)] ) . 



(16) 



1^0 



To obtain the values of the unknown parameters, {B,a,l,m), the maximum likelihood 
estimates of these parameters, {B,a,l,rh), are determined. Methods for determining 
these estimates are presented in Paper II. Here, we explore the precision with which these 
parameters can be estimated for a real instrument using the CRB formalism described in 
Section 12.51 
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3.1.2. Estimation precision for a point source 



For the signal, equation [131 the FIM for parameters (/, m, a, B) is given by; 



a' 



An^B^L,,, An^B^L 













v 













NB'^I.A NBI„ 



NBL 



Nil 



where /„ 



N F 

n=l f=l 

N F , 

n=l /=1 ^ 



2a 



/= 
AT F 

n=l f=l 



^0 



2a 



^0 



2a 



2a 



[log(K/)/^o)]^ 



F 

E 

/=1 



and Ji 



/=1 



^0 



2a 



2a 



(17) 



(19) 
(20) 
(21) 
(22) 
(23) 



Interestingly, the FIM does not depend directly on the source location (only indirectly 
through the antenna response function). This is because it is only the phase difference 
between antennas that is important. Instead, the baseline projections {uv) weight the 
information in each element of the summations. Intuitively this is because the longer 
baselines are sensitive to smaller changes in the source position, and therefore are weighted 
more highly in the information measure. Because the intensity scaling, S, is a linear 
multiplier for each signal, the information carried in the data about it scales directly with 
the number of baselines, and the noise variance. There is no covariance between the position 
parameters and the signal amplitude parameters. Therefore, any prior information on the 
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values of these parameters, will not affect one's ability to estimate the other parameters. 
Conversely, any factor that degrades the estimation of one group will not affect the other 
group. 

Inverting the FIM yields the following lower bounds on the precision of the parameter 
estimates: 

jl/2 

A/ > ^^[h^Iu^-iUT'^' (24) 

^jl/2 

Am > (25) 
zv ZttB 

Ac > ^M,-(«r"' (26) 

AB > ^ (/.)T''' (27) 

Note that the sky positions here are direction cosines, and are defined in radians for small 
angles, and B and a have the same units (i.e., Jy). This is a general expression for the 
theoretical maximum precision (estimation performance) on the position, amplitude and 
spectral index of a source in visibility data at {l,m), using F frequency channels and N 
baselines. The noise, a, is thermal noise per channel and baseline. These expressions 
exclude any systematic effects. They are the lower bounds on estimation of these parameters 
for an estimator that uses all of the available information in an unbiased manner. 

Equation |2l] suggests that a centrally concentrated array will have poorer astrometric 
precision than an array that is not centrally concentrated but has the same number 
of baselines and the same maximum baseline (i.e., one with uniform uv coverage). In 
conventional radio astronomy terminology, a naturally-weighted synthesized beam from a 
concentrated array will have a broader beam than a uniform uv array. These results agree 
with, and quantify, our intuitive expectations. 

The covariance (non-zero off-diagonal elements of the FIM) between the signal 
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amplitude and spectral index degrade the estimation performance for each parameter 
individually. If the spectral index is known, and for a = 0, the precision on the signal 
amplitude is given by AB > a/y/2FN i.e., it is proportional to the noise, integrated over 
all antennas and the total bandwidth. Introducing uncertainty in the value of the spectral 
index degrades the estimation of the signal amplitude, and vice versa. Figure [1] shows 
estimation precision for the 32-tile system (32T) of the MWA, where the phase centre has 
been set at the zenith. The MWA 32T has a linear extent of ~330m, and a synthesized 
beam at 150 MHz of 6'syn ~25 arcmin. Figure |2] displays the antenna positions for the MWA 
32T and ASKAP telescopes. Assuming a system equivalent flux density of 10,000 Jy, 1.28 
MHz channels and an 8 second integration, the noise in each of 24 channels is ~15 Jy. In 
all cases, the source has a constant amplitude of 5 = 0.7 Jy, translating to an integrated 
signal-to-noise ratio, S/N=5, over 30.72 MHz bandwidth (center frequency of 153 MHz) 
and using all 32 antennas. The upper figures display bounds as a function of the number 
of baselines, where the baselines have been ordered descending in length (i.e., the longest 
baselines are counted first), and for 24 frequency channels. The lower panels display bounds 
as a function of the number of spectral channels used for a fixed total bandwidth of 30.72 
MHz, constant source amplitude, and using all antennas. The following observations can be 
made: 

• Inclusion of the shortest baselines does affect estimation performance, due to the 
increase in sensitivity obtained by including additional antennas. This is important 
when considering removing short baselines to reduce the impact of diffuse emission on 
your signal; 

• The Fisher information on the source amplitude and spectral index are independent 
of the distribution of antennas in the array: they depend on the number of antennas, 
frequency channels and bandwidth, and have a l/y/N dependence on number of 
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Fig. 1. — (Upper) Bounds on unknown parameters as a function of number of baselines, where the basehnes have been 
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baselines; 

• Increasing the number of frequency channels degrades amplitude estimation 
performance, but improves spectral index estimation performance, due to the 
covariance of these parameters; 

• The position parameters have the same functional form, and improvement in one 
generally implies improvement in the other (unless the improvement is due to 
increased interferometer extent along only one axis); 

• In general, the uncertainty in the value of the spectral index is large compared with 
typical values, and this poor estimation is due to the high noise (short integration 
time) in this example; 

• The bounds for m are shghtly higher than for I due to the reduced spatial extent of 
the MWA 32T array in the t/-direction (elongated in the x-direction) , but both are 
~1 arcminute for S/N=5; 

• The CRBs on position estimates are much smaller than the synthesized beam of 
the telescope. This is because the CRB represents the maximal precision, and thus 
represents the performance of the optimal estimator (or deconvolution algorithm), if 
it exists (which it may not). The synthesized beam, or half-power beam width, is a 
coarse measure that only considers the contribution from the longest baseline in the 
array, without reference to the spatial location information carried by the shorter 
basehnes. 



These results are for a "perfect" interferometer, with no calibration errors and no 
systematic bias. Once additional errors are taken into account, the effective noise term (cr) 
will increase, and the estimation performance will be degraded. This will be closer to the 
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real situation in an interferometer, but will still omit any errors introduced by systematic 
bias i.e., the CRB is applicable only for unbiased estimation. 



Up to this point, we have considered the field to contain a single source, with unknown 
position and amplitude. We now generalise these results to the more realistic situation 
where the field contains many other static sources. Prom the perspective of transient 
detection, these static sources are nuisance signals that need to be removed. Removal of 
these signals might involve signal subtraction, or modelling, and this is a large field of 
current research, in itself. Algorithms that account for static sources will be explored in 
Paper II. 

These sources are modelled in the visibility data, and included in the GLKI as known 
parameters, viz; 



where is a vector that describes the parameters of the nuisance sources. The ML 
estimation of the source position and amplitude can proceed under this scheme as described 
above. 

Por K known static sources, and one unknown source, the signal is given by the sum 
of the complex visibilities for each source, 

s[f,n] = B{l,m) I 1 ex.p[-2TTi{ufnl + Vfnm)] + i^fccxp [-2m{ufnlk + Vfnmk)]. 



3.2. 



Inclusion of other sources in the field 




(28) 




fc=i 



(29) 



-23- 



3.3. Unknown signal cirri val time 

Transient sources, by their nature, appear at unknown times (for non-periodic sources) 
and remain visible for unknown durations. Including these parameters in the modelling 
necessarily requires extending the discussion to multiple integration steps. This has the 
benefit of increasing the number of data samples, thereby improving parameter estimation, 
but is comphcated by the evolution of (uv) as the Earth rotates. 

The ML estimation and GLRT detection test scheme described above, applied at 
each integration timestep (output of the correlator), naturally allow for appearance of a 
signal at a given timepoint. No signal-present will produce amplitude estimates within the 
noise level, and position estimates within the CRB of the phase centre. Appearance of a 
(sufficiently strong) signal will produce non-zero estimates and the test statistic will exceed 
the detection threshold. At this point, it is statistically advantageous to include all previous 
timesteps when a signal has been present in the ML estimation of parameters. The position 
of the source (/,m) and amplitude, B{l,m){v{f) /vq)°' , are constant over time. The GLRT 
then becomes: 

T N F 



1 1 

L(x, B, a, I, m, 6>; H^) = j^^^^^y^ exp <^ - — 



X] X] X] 

t=l n=l F=l 



(30) 



where 



■Z/nt = ( *1 - B(l, m) (— ^) exp [-27ri(uj:„(i + i;/„tm)] - V" exp [-27ri(uj:„(!j, + uj„tmi.)] 



(31) 



X ( n, t] - B(l, m) f — exp [-27ri(uf„ti + ■u/„t"»)I - [-2'ri(!if„tifc + n/ntiT-fe)] | . 

and t = (1, T) denotes the timestep, T timesteps have occurred since the initial detection 
of the transient signal, and the summation contains the contribution from the known, 
static sources within the field. The baseline projections are now functions of time as the 
Earth rotates, and are completely specified. Estimation of the transient source position 
and amplitude is extended to include all T timesteps, effectively reducing the estimation 
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uncertainty by a factor of r^l/\/T. Hence, at each integration timestep, after initial 
detection of a transient signal, two processes would occur. Firstly, the total dataset (since 
detection) would be used to estimate the values of the source parameters. Then, these ML 
estimates would be used in a GLRT, evaluated at the current timestep alone. This would 
allow a decision to be made about the continued presence of the source at that timestep. 
If all of the data were used for the detection, previous detections of the signal (in previous 
timesteps) would dilute the presence of the signal in the current timestep. In other words, 
if the signal has disappeared, it will be more difficult to detect this if all of the previous 
signal-present information is used. 



4. Other sources of uncertainty 

Up to this point we have considered only thermal noise in visibility data, however 
this is a simplification of reality. Amplitude and phase calibration errors, background 
confusing sources and atmospheric/ionospheric effects on the signal wavefront alter 
the noise properties of the visibility data. Note that here we use the term 'noise' in 
the general sense, including statistical noise, system noise and unresolved background. 
Accurate characterization of these effects is required to design an optimal detector. As 
well as designing an optimal detector, these effects introduce additional uncertainty to 
the modelling and will degrade the estimation and detection performance. In this section 
we use the Cramer-Rao bound to determine the precision on measuring calibration gain 
parameters, and include this additional uncertainty in the likelihood function describing 
the data. 

To include the additional uncertainty in the likelihood function, we add a term to the 
thermal noise that quantifies the uncertainty for each baseline and channel. In general, this 
is a covariance matrix, Cc, where non-zero off-diagonal terms quantify baseline-baseline 
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covariances. The likelihood function under the signal-present hypothesis becomes; 

m H,) = exp [-(i - ~sf{C. + a'inSc - s)] . (32) 

Multiplying the data model vector by the inverse of the covariance matrix prewhitens the 
data (removes the correlations between baselines, and weights each baseline according to 
the amount of information available about it). 



4.1. Calibration errors, confusion and atmospheric phase noise 



Liu et al.l ( l2010l ) have recently de scribed er r ors in t roduced by different calibratio n 



technique s , and extend earlier work by 



Cornwelll fllQSlI ) 



Cornwell and Fomalont 



(119891 ) and 



Wieringal ([1992) • Calibration errors can be classified into two types; (1) systematic bias, due 



to a biased calibration estimation method; and (2) estimation uncertainty (imprecision), 
due to limited information (with the Cramer- Rao bound as the lower limit). Systematic 
bias will shift the position of sources, but may not increase the model uncertainty. Bias on 
an antenna-by-antenna basis will blur the position of signals. We will consider unbiased 
estimation precision, and represent the amplitude and phase calibration errors as additive 
parameters with zero mean and known covariance. 

Low-resolution instruments may suffer from source confusion, whereby the density of 
background sources is sufficient to produce overlapping sources in the image plane through 
the source primary signal and sidelobes. Confusion-limited instruments have a natural 
detection hmit that corresponds to the confusion level, as opposed to the thermal noise 
level, which may be lower. The confusing signal is structured and behaves differently 
to thermal noise, because it corresponds to real signals. The MWA 32T and 512T are 
confusion-limited instruments. 



In general, the ASKAP instrument will not be confusion-limited, due to its higher 
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angular resolution compared with the MWA 32T and higher operating frequencies (this 
may not be the case for long integrations, but will be when considering each integration 
timestep independently). However, the higher frequencies are subject to tropospheric 
fluctuations, yielding visibilities that include atmospheric phase noise. This noise acts to 
blur the position of the source. The phase noise is a fun ction of the baseline length, ci, and 



its variance can be modelled by (iThompson et al. 



20041 ): 



2 _Afa^ 

where (3 is the index of the structure function describing the fluctuations (/3=5/6 for a 
Kolmogorov spectrum), and a is a scaling factor. The phase noise is largest for the longest 
baselines. A typical value for a of 10~^ corresponds to an rms phase noise of 2.5 degrees for 
the longest ASKAP baseline. 

Confusion will increase the effective noise level in the visibilities, and the rms confusing 
signal can be added in quadrature to the thermal noise. Atmospheric phase noise introduces 
additional uncertainty in the phase, and therefore in the argument of the trigonometric 
functions describing the signal. Before their inclusion into the detector, we derive the 
impact of calibration on source parameter estimation precision. 

There are two steps in determining the effect of calibration uncertainty on estimating 
the parameters of a source. The first is to determine how precisely calibration can be 
performed given a set of calibrators in the field. The second is to include this uncertainty 
in the overall system noise when estimating the source parameters. 



4- 1.1. Form of covariance matrix 

Primary (amplitude) calibration is achieved by observing a source of known flux 
density, typically at the phase centre, and adjusting the antenna-based gains to yield the 



-27- 



known flux density as an output. Secondary (phase) calibration can be performed in two 
ways: (1) observation of a bright point source at the phase centre, and adjustment of 
the antenna-based phases to be identically zero, and (2) self-calibration, using sources 
available in the field to produce a consistent phase solution. For single-dish instruments 
and interferometers with a small field-of-view, the former technique yields adequate results. 
For instruments with large fields-of-view, where multiple phase solutions are required 
(variation in calibration across the field), self-calibration will produce more accurate results 
at the edge of the field. Observations at frequencies below ~300 MHz have the additional 
complication of propagation delays introduced by the ionosphere. This delay shifts the 
position of sources in the sky, but does not blur the image. In this case, one forms a simple 
time-dependent model for the ionospher ic phase screen from sources in the field-of-view. 



and performs a 'field-based' calibration (IKassim et al. 



20071 ). This requires a high cadence 



of phase calibration to be performed, and to be practical, necessitates the field-based 
calibration method. At higher frequencies, phase noise caused by the neutral troposphere 
causes a blurring of the source position. 

An adequate distribution of secondary calibrators across the field should produce 
unbiased phase solutions, with the spatial variation accounted for in the solutions. The 
phase errors in this case will reduce to measurement errors based on the number and 
strength of the sources available. An inadequate density of sources may lead to large 
uncertainty on the phase calibration. Primary (bandpass) calibration typically employs 
a very strong source, occurs relatively infrequently (~hours), and is performed for each 
frequency channel. The high source signal-to-noise ratio will yield high precision on the 
calibration. The field-based calibration, however, is subject to short timescale atmospheric 
fluctuations. For these instruments it will employ sources of varying strengths, occur 
frequently (~10s), and will involve simultaneous solution for all antennas across the whole 
bandwidth. Here we will consider the effect of field-based calibration on source estimation. 
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The covariance matrix element for baselines i,j is given by; 

Cij = E[{xi - - (34) 

where the expectation value is taken over multiple realisations of the calibration. With an 
ionospheric model and a given density of static field sources, the covariance matrix can be 
approximated analytically. For more complex and realistic distributions, simulations can 
be used to calculate the covariance matrix empirically. Real datasets may also be used to 
quantify the covariance matrix: each independent integration in an observation of a static 
field can be used to approximate an independent noise realization, and the covariance 
matrix estimated using equation [341 For the purposes of this paper, and to demonstrate 
the magnitude and impact of these errors on source estimation and detection, we form an 
approximate analytical covariance matrix based on the theoretical precision with which 
calibration can be performed, and known expressions for the magnitude and distribution of 
uncertainty introduced by confusion and atmospheric phase noise. 

To approximate the form of the covariance matrix, Cc, we consider the measurement 
errors for amplitude and phase calibration for a given antenna, and use error propagation 
to express the variance for a given baseline. Errors in radio astronomy are typically 
antenna-based, however the covariance matrix we require needs to describe uncertainty on 
a baseline basis (since these are the data we measure). Note that the formulation of the 
problem and the error propagation take into account the connectivity of antennas: i.e., an 
error on one antenna will propagate through all of the baselines it forms with every other 
antenna. 
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4- 1-2. Cramer- Rao bounds on estimation of gain parameters 

We begin by calculating the theoretical optimal precision with which the amplitude and 
phase calibration can be measured for a single antenna, for an M-antenna interferometer 
and Nc calibrators, with positions {liy^,mNj and flux densities, BN^{v{f) /vq)"^^'^ . We write 
the complex gain for baseline n comprising antennas /3 and 7, as: 

Gn = G/sG^ = exp 2ni{(f)f3 - 0^), (35) 

where 6^ and 0/3 are the amplitude and phase gain parameters for antenna, /3. The joint 
PDF for all of the baselines is proportional to: 

F M M 

^ ryH r, 

'Pi 



L{x) oc exp 
where 



^ F M M 

5Z 5Z 5Z ^?7^/5 



2(t2 

f=l 13=1 75^/3 



(36) 



Z(3^ = -S^Bi I ) bpb^exp-2'n:i{uffj^li + Vfi3^mi + (f)i3-(j)^), (37) 

and there are M antennas. 

The Fisher Information Matrix is a 2M x 2M matrix to estimate all of the b and 
(f) parameters. There are no covariances between the b and (p parameters, so the FIM is 
equivalent to two M xM matrices. Therefore, there are two FIMs to invert, FIM;, and FIM,/,. 
Constructing FIM^ yields a singular matrix, due to the phases being relative quantities. 
Typically, the phase gain for one antenna is set to zero, and the others are defined relative 
to this. Hence, we set 0i = 0, and remove this parameter from the estimation (it is assumed 
completely specified). Therefore FIM^ becomes a (M — 1) x (M — 1) matrix. 

We derive the CRBs in Appendix |X] and present the solutions here. The general 
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solutions for the baseline precision Nc calibrators and M antennas are: 



a 

72 



/=i \j=i ^ ^ 



+ 'S2y2 BiBj i ) cos27r{ufai3ilj -k) + Vfai3{mj -nii)) 



1 -1/2 

(38) 



a 



A(j)aP > —j====y £ab...z^a^pAi^aA2,h---AM-l,z / V £^ab...2^1,a^2,6---^M-l,^, (39) 

V2(2vr)^ 

where e is the Levi-Civita permutation symbol, (a, 6, 2;) G [1, M — 1], there are implicit 
summations over all indices, and Aa^b are the FIM matrix elements, and are given by, 

M-l 

, biy^blXak, a = b 
Aa,b ={ , (40) 



-blblX^b, a ^ b 



and Xab is given by: 

F 



^ 52 W2 + E E ^^^4 ^ 27r(«;,,(/, - /,) + Vfabim, - m,)) 

i=l V / j=i \ / 



(41) 

The expression for the phase uncertainty is not easy to implement, and in practise, it 
is far simpler to form the FIM and invert numerically, and extract the baseline-based 
uncertainties using error propagation on the inverse FIM elements. 

Equations [3811391 are the baseline-based gain precision limits, and we have used error 
propagation (including covariances) from the antenna-based uncertainties to obtain these. 
These expressions make sense intuitively. If we ignore the cross-terms initially, the solution 
scales inversely with the total calibrator signal strength (Xl-^*)- The amplitude precision 
depends solely on the antennas forming the baseline, whereas the phase precision depends 
on the relative contributions from all baselines involving the antennas in question. This is 
due to the phase being a relative quantity. Increasing the number of antennas improves the 
estimation precision, as does increasing the signal strengths. The cross-terms weight the 
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contributions from individual antennas, according to the baseline projections on the vector 
between the calibrator sources. 

Up to this point, the noise parameter, cr, has referred to the thermal noise, which, for 
the MWA 32T is ~15 Jy per visibility in each coarse 1.28 MHz channel. However, the 
MWA will be confusion-limited, and the actual 'system noise' will be higher due to the rms 
fluctuations generated by the confusing sources. Assuming a confusion of 1 Jy/beam, this 
corresponds to ~100 Jy rms in each visibility. These 'noise' terms are independent, and can 
be added in quadrature to produce an overall system noise of ~100 Jy in each visibility. 

In Figure [3] we show the calibration amplitude and phase estimation precision, for the 
current MWA 32-tile system, and for varying calibrator numbers and strengths. For each 
plot, the precision is shown for thermal noise alone ('Therm') and for thermal+confusion 
('Con+therm'). The upper figures show the precision on phase and amplitude estimation for 
a single baseline, for a total calibrator flux density of 3.5 Jy, and as a function of number of 
calibrators. These are ensemble averages to remove the effects of calibrator position. These 
figures demonstrate that it is advantageous to have a single bright calibrator, rather than a 
few lower signal-to-noise calibrators. The precision scales as the square-root of the number 
of calibrators. The lower figures show the precision as a function of number of calibrators, 
where each calibrator has B=3.5 Jy. This corresponds to S/N=5 for an 8s integration. 
The precision here scales as the inverse square-root of the number of calibrators. In all 
cases, the amplitude gain parameters (6) are set to unity. Note that these plots are for a 
particular baseline. In general, the plots could vary substantially between baselines. If, 
for example, there was little calibrator information for a particular antenna, the precision 
will be degraded for the visibilities across all of the baselines it forms. Therefore, the 
connectivity (correlation) of the antennas is naturally accounted for in this formalism. 
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Fig. 3. — (Upper) Gain phase and amplitude precision for a single baseline, and all frequency 
channels, for a total calibrator flux density of 3.5Jy, as a function of number of calibrators 
(cr = 15 Jy/channel/baseline, a = 0, = 8s, F = 24, Au = 30.72 MHz). (Lower) Bounds 
as a function of number of calibrators, where each has a flux density of 3.5 Jy. 
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4.1.3. Inclusion of calibration uncertainty in estimation of source parameters 

Now we have quantified the uncertainty that field-based cahbration introduces into the 
data, the next step is to incorporate the additional uncertainty in measuring the parameters 
of a source. Previously, the Cramer-Rao bounds on estimating source parameters assumed 
only thermal noise in the system. To this noise we now add components that quantify the 
additional uncertainty. There are two sets of uncertainties to be introduced: a covariance 
matrix, Cc that quantifies the amplitude uncertainty, and a set of phase parameters that 
quantify the phase uncertainty (broadening the overall likelihood function acts on the real 
and imaginary components of the data, and therefore cannot easily include errors on the 
phase) . 

For a general covariance matrix and complex data, the general expression for the Fisher 
Information Matrix becomes: 



and Cc is the covariance matrix due to the amplitude calibration uncertainty. Note that 
the noise term, a, is the thermal noise — we assume that the background sources have 
been modelled and subtracted (including confusing sources). In practise, some level of 
background source will remain, and this also can be included in the modelling. 

The construction of the covariance matrix, Cc refiects the calibration uncertainties 
on each baseline (equations [38] - [39|) . For the pedagogical case we are considering here, we 
assume that the calibration process does not introduce any correlations between baselines 




(42) 



where we write the covariance matrix, C, as: 




(43) 
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or channels, other than through the common antenna gain term. This is equivalent to 
asserting that there are no baseline-based errors. In reahty, the off-diagonal terms will be 
non-zero, but small. We write the nth component of the diagonal covariance matrix as: 

[Ce]„ = B'Abl (44) 

where B is the strength of the source and A6„ is the calibration amplitude precision 
for baseline n. The source strength is here to have the correct units, and reflects the 
dependence of the absolute scale of the cahbration errors on the source strength. Using 
this scheme, the system noise for low signal-to-noise ratio sources will be dominated by the 
thermal noise, whereas high signal-to- noise sources will have a relatively larger calibration 
error component. 

The phase uncertainty is modelled as an additional parameter, ipn, in the argument 
of the exponential in the signal. This is a random (as compared with a deterministic) 
parameter, for which we possess prior knowledge (the phase calibration uncertainty), 
and is dependent on baseline, n. Note that we are referring here to uncertainty in a 
statistical sense: we do not wish to estimate the actual phase fluctuations for each antenna 
(which are constrained by closure phase, and are nuisance parameters), but instead want 
to understand the additional uncertainty they introduce. Instead of estimating the four 
deterministic parameters of the transient source {B, a, 1,171), we simultaneously estimate 
these parameters and the N random phase parameters, ipn- We use the prior knowledge of 
how these parameters are distributed to include additional information in a modified Fisher 
Information Matrix. 

The CRLB cannot be extended easily to include prior information. An equivalent 
expression for random parameters is available using a Bayesian approach where the 
probability distribution function describing the data includes the probability distribution 
function of the parameter. This approach allows prior information on the value of the 
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parameter to be incorporated into the bound. iRockah and SchultheissI (Il987| ) introduced 
the Hybrid Cramer-Rao lower bound (HCRLB) as an extension to the CRLB that allows 
estimation of both random and deterministic parameters. The probability distribution 
functions of the random parameters can contain prior information on the distribution of 
that parameter, and improve the estimation performance. In practice, this is achieved by 
the Fisher Information containing contributions from both the data (classical CRLB) and 
the prior information. 



The modified FIM is given by: 



/(0)' = E^[/(0)]+JprW, 



(45) 



where I{6) is the classical (data) FIM, and /pr('j/') is the prior information, and is given by: 

d log ]?(?/?) ^ d log 



(46) 



for component ij. The expectation over the random parameter of the data component is 
often omitted for tight prior distributions, resulting in a modified FIM that is the sum of 
the data and prior components. For gaussian PDFs with variance, o"^, the prior information 
is: 

I,,{^^) = l/al^. (47) 

As discussed above, there is no covariance between the source position parameters (/, m) 
and the source amplitude parameters {B, a). The random phase parameters, ipn, also do not 
co-vary with the amplitude parameters, and their inclusion therefore has no impact on the 
ability to estimate them (the additional amplitude uncertainty does affect all parameters, 
however). In the case of calibration phase uncertainty and atmospheric phase noise, the 
prior PDF is broadened to include contributions from both uncertainties. 

We now form the (A^ -|- 4) x (A^ -|- 4) FIM for the information carried in the data about 
the source parameters, and invert to yield the lower bounds on parameter estimates. This 
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FIM now includes the effects of amplitude and phase calibration. Figures ll](a-c) display 
the maximum estimation precision for a source, as a function of source signal-to-noise ratio 
(thermal noise), for system noise being purely thermal, and for thermal+calibration errors. 
In these figures, the calibration is performed using five 1 Jy (each with S/N=7) sources 
in the field, and is performed every 8 seconds. It is clear from these plots that the MWA 
32T is dominated by thermal noise and confusion for this case, and that the field-based 
calibration is not a significant source of uncertainty for low signal-to-noise ratio sources. Of 
course, a reduction in the number or strength of calibrator sources will affect these results. 
As mentioned earlier, the number of calibrators considered here are those within some scale 
on the sky over which the atmosphere can be considered stable. This group of calibrators 
can then be used jointly to estimate the gains. For stable atmospheres, and large patches 
of sky, the number of calibrators will be large, and the calibration precision will be high. 
For an unstable atmosphere with small patches, the number of calibrators will be low, 
and the calibration precision will be low. Therefore, this formalism includes the effects of 
position-dependent calibration in a rudimentary way. 

Figure Hl^d) shows the ideal precision for estimation of source position (Z) for ASKAP 
at 1.4GHz (Az/=300MHz, 32 channels, 5 second integration), and including three levels of 
atmospheric phase noise. In the case of the ASKAP system, the calibration uncertainty 
is small, and the additional uncertainty on the source position is dominated by the 
atmospheric phase noise. 

5. Optimal detector 



We have presented an analytical model for the impact of calibration, background 
confusing sources and the atmosphere on source estimation and precision. With this model, 
and the statistical framework developed in section [21 we can describe the optimal detector 
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Fig. 4. — (a) Precision in estimating the source sky position (l) for a source as a func- 
tion of signal-to- noise ratio, for calibration based on five 1 Jy (S/N=7) sources {a — 15 
Jy /channel/baseline, a = 0, At = 8s, F = 24, Au = 30.72 MHz), (b) Amplitude (B) preci- 
sion, (c) Spectral index (a) precision, (d) Sky position (/) precision for ASKAP, including 
three magnitudes of atmospheric phase noise (a=10~^, 5xl0~^, 10~^). 
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for visibility data. The signal-present hypothesis likelihood function for one integration 
timestep can be described by: 

Li±; H,) = exp [-(i - s)^(a + a^ir{± - s)] , (48) 

where Cc contains the amplitude calibration uncertainties, and the signal for channel / and 
baselines n is given by: 

■^[/i — B{1, m) ( — \ cxp [ — 27vi(u j^j^l + uy^m + tpn)] + ^ ^ -B^ exp [ — 27ri(ii f^i^k + fn'^k ~l" V'n )] ; (49) 
^ "0 / fc = l 

and the phase parameters are gaussian distributed according to the calibration and 
atmospheric phase uncertainties. The likelihood function for the signal-absent hypothesis, 
ifo) has the same form as equation HHl and the 'signal' is given by: 

K 

= ^5fcexp[-27ri(M/„4 + w/„mfc + ?/'„)], (50) 
fc=i 

(i.e., background static sources). Note that both likelihood functions require the phase 
uncertainty terms to be removed. 

To perform the detection, one needs to estimate or remove all of the unknown 
parameters. The unknown source parameters are estimated using maximum likelihood 
estimation (ideally), and the random phase parameters are integrated out, using their 
prior PDFs and the Bayesian approach described in section 12.31 Algorithmically, the 
unknown deterministic parameters are estimated first, with the phase parameters set at 
their mean values {ipn = 0) for simplicity (in practise, the phase errors will be small, and 
this simplification will have minimal impact on the detection performance). Then, the 
one-dimensional integrals are performed to remove the N random parameters. Finally, 
the GLRT is performed by taking the ratio of the values of the likelihood functions, and 
the result compared with a threshold. In Paper II we use the results derived here to form 
realistic likelihood functions, and present algorithms for implementing the optimal detector. 
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6. Conclusions 

We have presented a framework for designing optimal source detectors with visibility- 
space data from interferometric arrays, and applied this to describe a realistic optimal 
detector. Working in visibility space allows a more natural characterisation of the data 
likelihood functions than image space, where noise is structured and not well-understood. 
Source detection is complicated by unknown source strength, spectral index, position, arrival 
time and duration (transient sources), and the uncertainty on these parameters reduces 
detection performance. Estimation of these parameters is required before signal detection 
can be performed. Uncertainty introduced by field-based calibration, confusing sources 
and atmospheric phase noise further complicates signal detection and reduces detection 
performance. We have explored the impact of these additional sources of uncertainty on the 
ability of an efficient estimator to determine the parameters of a source, and applied these 
methods to two SKA pathfinder instruments: the MWA 32T and ASKAP. We then used 
an understanding of these effects to present a realistic model of visibility data, and design 
an optimal detector. 

Appendix 

A. Derivation of calibration precision 

We have calculated (and will present below) the estimation precision for a single 
calibrator, and two calibrators. Prom this, we can extrapolate to Nc calibrators, based on 
the form. The precision with which the calibration solution for a given baseline can be 
theoretically measured (since this is the data we measure) can be calculated using error 
propagation from the bounds on the individual antennas alone (and the covariances) . Por 
a single calibrator, and three antennas, the precision with which the gain amplitude can 



theoretically be measured is given by: 



a 



B. 



(Al) 



i.e., the inverse signal-to-noise for complex data. For two calibrators, the expression 
becomes: 



a 



V2 



./=i 



+2B1B2 



1^0 



2a2 



01+02 



COS 2'K{Ufal3{h - h) + Vfa0{m2 - mi)) 



-1/2 



(A2) 



Extrapolating to Nc calibrators yields: 



A6„, > ^ 



./=1 \i=l 

Nc Nc 



2a, 



+ ^ BiBj ( — — ) cos 2Tr{ufai3{lj - k) + Vfai^irrij - rrii)) 



i=i jj^i ^ " / 

which, for Nc identical, co-located calibrators with strength B and a — 0, gives 
a/{V2FNcB). 

For the gain phase precision (setting 0i to zero) and one calibrator, the precision is: 

1 



-1/2 
(A3) 



6^ + 62 



^/Wn\'" ^^^^^ Vbl + bl + bl 



(A4) 



/=i 



For two calibrators, this expands to include the cosine cross-terms, and is given by: 

1 



A(/.i2 > 



23 



VS^ bib2 ^blXi2Xis + 62X12X23 + 61X13X23 ' 



(A5) 



where 
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/=i 



cos 27r{Uaf,{^2 — ^l) + ^ah(^2 — ^l)) 



Extrapolating to Nc calibrators gives: 



12 



> 



(A6) 



(A7) 



where 

F 

Xab = 

/=1 



' Nc / r r\ \ 2ai Nc N, 



1=1 



+ ^ ^ BiBj ( — ) cos 2TT{uab{lj - k) + Vab{rnj - rrii)) 



i=l jjti 



Note that this is still the solution for a three-antenna system. The general solutions for N^. 



calibrators and M antennas are: 



A6^^ > 



71 



' F ( Nc 

./=i \i=i 



2ai 



Nc Nc 



^0 



COS 2n {uf {I j - k) + Vfi3^{mj - nii)) 



-1/2 
(A9) 



A0a^ > 



^ab...z^a^l5Ai^a^2,b---A 



M-l,z 



£ab...z^l,aA2h---AM-l,. 



(AlO) 



where e is the Levi-Civita permutation symbol, {a,b,...,z) G [1,M — 1], and Aa^ are the 
FIM matrix elements, and are given by. 



M-l 



blJ2^kXak, a = b 
-blblXab, a^b 



(All) 



where Xab is the same as in equation IA8I There is an implicit summation over all indices in 
equation lAlOl viz, 

M-lM-l M-l 

S^a6...z^l,a^2,fe---^M-1,2 = ••• £^a6...z^l,a^2,6- • -^M-l.z • (^12) 

a=l 6=1 2=1 

The expression for the phase uncertainty is not easy to implement, and in practise, it 
is far simpler to form the FIM and invert numerically, and extract the baseline-based 
uncertainties using error propagation on the inverse FIM elements. 
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